Biomedical Named Entity Recognition via Dictionary-based Synonym Generalization
Biomedical named entity recognition is one of the core tasks in biomedical
natural language processing (BioNLP). To tackle this task, numerous
supervised/distantly supervised approaches have been proposed. Despite their
remarkable success, these approaches inescapably demand laborious human effort.
To alleviate the need for human effort, dictionary-based approaches have been
proposed to extract named entities based solely on a given dictionary. However,
one downside of existing dictionary-based approaches is that they struggle to
identify concept synonyms that are not listed in the given dictionary, a
limitation we refer to as the synonym generalization problem. In this
study, we propose a novel Synonym Generalization (SynGen) framework that
recognizes the biomedical concepts contained in the input text using span-based
predictions. In particular, SynGen introduces two regularization terms, namely,
(1) a synonym distance regularizer; and (2) a noise perturbation regularizer,
to minimize the synonym generalization error. To demonstrate the effectiveness
of our approach, we provide a theoretical analysis of the bound on the synonym
generalization error. We extensively evaluate our approach on a wide range of
benchmarks and the results verify that SynGen outperforms previous
dictionary-based models by notable margins. Lastly, we provide a detailed
analysis to further reveal the merits and inner workings of our approach.
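As a rough illustration of how span-based predictions and the two regularizers could be combined in one training objective, here is a minimal PyTorch-style sketch; the function and tensor names (`span_logits`, `synonym_emb`, the `lambda_*` weights) are hypothetical and not taken from the SynGen implementation:

```python
import torch
import torch.nn.functional as F

def syngen_style_loss(span_logits, span_labels,
                      entity_emb, synonym_emb,
                      clean_emb, noisy_emb,
                      lambda_syn=0.1, lambda_noise=0.1):
    """Illustrative composite objective: span classification plus
    (1) a synonym-distance term pulling dictionary entries and their
        synonyms together in embedding space, and
    (2) a noise-perturbation term keeping span representations stable
        under small input perturbations."""
    # Standard classification loss over candidate spans.
    cls_loss = F.cross_entropy(span_logits, span_labels)
    # (1) Synonym distance regularizer.
    syn_reg = ((entity_emb - synonym_emb) ** 2).sum(dim=-1).mean()
    # (2) Noise perturbation regularizer.
    noise_reg = ((clean_emb - noisy_emb) ** 2).sum(dim=-1).mean()
    return cls_loss + lambda_syn * syn_reg + lambda_noise * noise_reg
```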
A Theoretical Analysis of the Repetition Problem in Text Generation
Text generation tasks, including translation, summarization, and language
modeling, have seen rapid growth in recent years. Despite these remarkable
achievements, the repetition problem has been observed in nearly all text
generation models, substantially undermining generation quality. To solve
the repetition problem, many methods have been proposed, but there is no
existing theoretical analysis that explains why this problem happens or how it
can be resolved. In this paper, we propose a new framework for the theoretical
analysis of the repetition problem. We first define the Average Repetition Probability
(ARP) to characterize the repetition problem quantitatively. Then, we conduct
an extensive analysis of the Markov generation model and derive several upper
bounds of the average repetition probability, together with intuitive interpretations. We
show that most of the existing methods are essentially minimizing the upper
bounds explicitly or implicitly. Grounded on our theory, we show that the
repetition problem is, unfortunately, caused by the traits of our language
itself. One major reason is that too many words predict the same word as their
subsequent word with high probability. Consequently, it is easy for generation
to return to that word and form repetitions, an issue we dub the high inflow
problem. Furthermore, we derive a concentration bound
of the average repetition probability for a general generation model. Finally,
based on the theoretical upper bounds, we propose a novel rebalanced encoding
approach to alleviate the high inflow problem. The experimental results show
that our theoretical framework is applicable in general generation models and
our proposed rebalanced encoding approach alleviates the repetition problem
significantly. The source code of this paper can be obtained from
https://github.com/fuzihaofzh/repetition-problem-nlg.
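As a rough, self-contained illustration of repetition in a Markov generation model, the sketch below fits a bigram model and measures a simple empirical repetition rate; this proxy is hypothetical and is not the paper's Average Repetition Probability (ARP) definition:

```python
import random
from collections import defaultdict

def build_bigram_model(tokens):
    """Fit a simple Markov (bigram) generation model from a token list."""
    counts = defaultdict(lambda: defaultdict(int))
    for prev, cur in zip(tokens, tokens[1:]):
        counts[prev][cur] += 1
    model = {}
    for word, nxt in counts.items():
        total = sum(nxt.values())
        model[word] = {v: c / total for v, c in nxt.items()}
    return model

def repetition_rate(model, start, steps=200, window=4, trials=50):
    """Empirical proxy for repetition: the fraction of generated tokens
    that already occurred within the previous `window` positions."""
    repeated, total = 0, 0
    for _ in range(trials):
        seq = [start]
        for _ in range(steps):
            nxt = model.get(seq[-1])
            if not nxt:
                break
            words, probs = zip(*nxt.items())
            tok = random.choices(words, weights=probs)[0]
            repeated += tok in seq[-window:]
            total += 1
            seq.append(tok)
    return repeated / max(total, 1)

# Usage: rate = repetition_rate(build_bigram_model(corpus_tokens), corpus_tokens[0])
```

A word with many high-probability predecessors (the "high inflow" case) drives this rate up, because many sampled paths funnel back into it.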
FISEdit: Accelerating Text-to-image Editing via Cache-enabled Sparse Diffusion Inference
Due to the recent success of diffusion models, text-to-image generation is
becoming increasingly popular and achieves a wide range of applications. Among
them, text-to-image editing, or continuous text-to-image generation, has
attracted considerable attention and can potentially improve the quality of generated images.
Users commonly want to slightly edit a generated image by making minor
modifications to the input textual description over several
rounds of diffusion inference. However, such an image editing process suffers
from the low inference efficiency of many existing diffusion models even using
GPU accelerators. To solve this problem, we introduce Fast Image Semantically
Edit (FISEdit), a cache-enabled sparse diffusion model inference engine for
efficient text-to-image editing. The key intuition behind our approach is to
utilize the semantic mapping between the minor modifications on the input text
and the affected regions on the output image. For each text editing step,
FISEdit can automatically identify the affected image regions and utilize the
cached feature maps of the unchanged regions to accelerate the inference process.
Extensive empirical results show that FISEdit is substantially faster than
existing methods on NVIDIA TITAN RTX and A100 GPUs, and even generates more
satisfactory images.
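A minimal sketch of the cache-and-mask intuition, assuming feature maps from the previous and current conditioning passes are available; the helper name and threshold are illustrative and not FISEdit's actual engine:

```python
import torch

def masked_feature_reuse(cached_feats, new_feats, threshold=0.1):
    """Sketch of the cache-reuse idea: keep cached activations where a
    minor text edit barely changes them and recompute only the affected
    regions. Both inputs are (C, H, W) feature maps; `threshold` is an
    illustrative sensitivity knob, not FISEdit's actual criterion."""
    # Per-location change magnitude, averaged over channels.
    delta = (new_feats - cached_feats).abs().mean(dim=0)       # (H, W)
    # Regions judged "affected" by the text edit.
    affected = (delta > threshold).float()                      # (H, W)
    # Fresh values inside the mask, cached values elsewhere.
    merged = affected * new_feats + (1.0 - affected) * cached_feats
    return merged, affected
```

In this simplified view, only the masked regions would need full denoising compute at each editing round, which is where the speedup comes from.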
Grand canonical Monte Carlo simulation on adsorption of aniline on the ice surface
Aniline has been found to have frequent environmental occurrence and high toxicity. However, little research has examined its environmental fate. Here, we employed Grand Canonical Monte Carlo (GCMC) simulations to investigate the adsorption behavior of aniline on the hexagonal ice surface at 200 K, using our modified force field for aniline and the TIP5P force field for water. The results indicate that the adsorption isotherm of aniline exhibits a “monolayer saturation plateau”, starting with a rapid increase, then a plateau, and finally a condensed phase. At very low surface coverage, the adsorption isotherm apparently follows a Langmuir-type adsorption isotherm, although anilines can be adsorbed to various sites. Within the range of the apparent Langmuir-type adsorption isotherm, adsorbed anilines are independent of each other, and most anilines lie almost parallel to the ice surface and form two N−H•••O hydrogen bonds. As the coverage increases, the adsorbed anilines can interact with each other, resulting in deviation from the apparent Langmuir-type adsorption isotherm. In addition, the adsorption energy from the GCMC simulation (−65.91 kJ mol−1) is in good agreement with that from our validating quantum chemistry calculation (−69.34 kJ mol−1), further confirming the reliability of our GCMC simulation results.
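For reference, the Langmuir-type behavior mentioned above can be illustrated with a small fit of the Langmuir isotherm n(p) = n_max·K·p / (1 + K·p); the data below are synthetic points generated from the model itself plus noise, not the paper's GCMC results:

```python
import numpy as np
from scipy.optimize import curve_fit

def langmuir(p, n_max, K):
    """Langmuir isotherm: coverage n(p) = n_max * K * p / (1 + K * p),
    with n_max the monolayer saturation capacity and K the equilibrium
    constant."""
    return n_max * K * p / (1.0 + K * p)

# Synthetic coverage-vs-pressure points, standing in for the kind of
# adsorption isotherm a GCMC run at fixed temperature would produce.
rng = np.random.default_rng(0)
p = np.linspace(0.01, 2.0, 25)
n = langmuir(p, 20.0, 3.0) + rng.normal(0.0, 0.3, p.size)

(n_max_fit, K_fit), _ = curve_fit(langmuir, p, n, p0=[10.0, 1.0])
print(f"fitted monolayer capacity ~ {n_max_fit:.1f}, K ~ {K_fit:.2f}")
```

Deviation of the measured points from such a fit at higher coverage is exactly the adsorbate-adsorbate interaction effect described in the abstract.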
BAND: Biomedical Alert News Dataset
Infectious disease outbreaks continue to pose a significant threat to human
health and well-being. To improve disease surveillance and understanding of
disease spread, several surveillance systems have been developed to monitor
daily news alerts and social media. However, existing systems lack thorough
epidemiological analysis in relation to corresponding alerts or news, largely
due to the scarcity of well-annotated report data. To address this gap, we
introduce the Biomedical Alert News Dataset (BAND), which includes 1,508
samples from existing reported news articles, open emails, and alerts, as well
as 30 epidemiology-related questions. Answering these questions requires the
model's expert reasoning abilities, thereby offering valuable insights into the
outbreak of the disease. The BAND dataset brings new challenges to the NLP
world, requiring better disguise capability of the content and the ability to
infer important information. We provide several benchmark tasks, including
Named Entity Recognition (NER), Question Answering (QA), and Event Extraction
(EE), to assess how well existing models handle these tasks in the
epidemiology domain. To the best of our knowledge, the BAND corpus is the
largest corpus of well-annotated biomedical outbreak alert news with
elaborately designed questions, making it a valuable resource for
epidemiologists and NLP researchers alike.
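A hypothetical sketch of what one annotated BAND-style sample covering the three benchmark tasks might look like; the field names are illustrative and not the released dataset schema:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class OutbreakNewsSample:
    """Hypothetical shape of one annotated alert-news sample covering the
    three benchmark tasks (NER, QA, EE)."""
    text: str
    # NER: (start, end, label) character spans, e.g. diseases and locations.
    entities: List[Tuple[int, int, str]] = field(default_factory=list)
    # QA: answers to the epidemiology-related questions, keyed by question.
    qa: Dict[str, str] = field(default_factory=dict)
    # EE: outbreak events with typed arguments (pathogen, place, time, ...).
    events: List[Dict[str, str]] = field(default_factory=list)
```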
Decoder-Only or Encoder-Decoder? Interpreting Language Model as a Regularized Encoder-Decoder
The sequence-to-sequence (seq2seq) task aims at generating the target
sequence based on the given input source sequence. Traditionally, most seq2seq
tasks are addressed with the Encoder-Decoder framework, which requires an
encoder to encode the source sequence and a decoder to generate the target
text. Recently, a number of new approaches have emerged that apply decoder-only
language models directly to the seq2seq task. Despite the significant
advancements in applying language models to the seq2seq task, there is still a
lack of thorough analysis on the effectiveness of the decoder-only language
model architecture. This paper aims to address this gap by conducting a
detailed comparison between the encoder-decoder architecture and the
decoder-only language model framework through the analysis of a regularized
encoder-decoder structure. This structure is designed to replicate all
behaviors in the classical decoder-only language model but has an encoder and a
decoder, making it easier to compare with the classical encoder-decoder
structure. Based on the analysis, we unveil the attention degeneration problem
in the language model, namely, as the number of generation steps grows, less and
less attention is focused on the source sequence. To give a quantitative
understanding of this problem, we conduct a theoretical sensitivity analysis of
the attention output with respect to the source input. Grounded on our
analysis, we propose a novel partial attention language model to solve the
attention degeneration problem. Experimental results on machine translation,
summarization, and data-to-text generation tasks support our analysis and
demonstrate the effectiveness of our proposed model.
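A toy sketch of the attention degeneration effect described above: with a fixed-length source and a growing number of generated tokens, the share of attention mass that the last decoding step places on the source shrinks. Random queries and keys stand in for a trained decoder-only model, so this is an illustration, not the paper's sensitivity analysis:

```python
import torch

def source_attention_share(source_len, target_len, d=64, seed=0):
    """Fraction of single-head attention mass that the last target position
    places on the source tokens, using random queries/keys as a stand-in
    for a trained decoder-only model."""
    g = torch.Generator().manual_seed(seed)
    q = torch.randn(1, d, generator=g)                        # last-step query
    k = torch.randn(source_len + target_len, d, generator=g)  # all keys so far
    attn = torch.softmax(q @ k.T / d ** 0.5, dim=-1)          # (1, src + tgt)
    return attn[0, :source_len].sum().item()

for t in (1, 10, 100, 1000):
    share = source_attention_share(source_len=20, target_len=t)
    print(f"target length {t:4d}: attention mass on source ~ {share:.2f}")
```

Because nothing privileges the source tokens here, their share decays roughly like source_len / (source_len + target_len) as generation proceeds, which is the degeneration a partial-attention design aims to counteract.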